Factored Translation Models
Authors
Abstract
We present an extension of phrase-based statistical machine translation models that enables the straightforward integration of additional annotation at the word level, whether linguistic markup or automatically generated word classes. In a number of experiments we show that factored translation models lead to better translation performance, both in automatic scores and in grammatical coherence.
Similar Resources
Statistical Translation Models: A Literature Survey
In this survey, we briefly study Phrase-based, Factored and Hierarchical translation models. First we cover the basics of the Phrase-based model. Then we introduce an interesting SMT approach called Factored translation models. We also study the mathematical modeling of Factored models. Finally, we compare Factored models with Phrase-based models and identify their disadvantages, which are pulling t...
Factored Translation between Brazilian Portuguese and English
Factored translation is an extension of state-of-the-art phrase-based statistical machine translation (PB-SMT). The main difference in the factored translation approach is that a word is not only a token (its surface form) but a vector composed of different information such as lemma, part-of-speech or morphologic/syntactic tags. In this paper we present some experiments carried out to train and ...
Interpolated Backoff for Factored Translation Models
We propose interpolated backoff methods to strike the balance between traditional surface form translation models and factored models that decompose translation into lemma and morphological feature mapping steps. We show that this approach improves translation quality by 0.5 BLEU (German–English) over phrase-based models, due to the better translation of rare nouns and adjectives.
Factored translation models for enriching spoken language translation with prosody
Key contextual information such as word prominence, emphasis, and contrast is typically ignored in speech-to-speech (S2S) translation due to the compartmentalized nature of the translation process. Conventional S2S systems rely on extracting prosody-dependent cues from hypothesized (possibly erroneous) translation output using only words and syntax. In contrast, we propose the use of factored t...
Feature Selection for Factored Phrase-Based Machine Translation
In the presented work we investigate factored models for machine translation. We provide a thorough theoretical description of this machine translation paradigm. We describe a method for evaluating the complexity of factored models and verify its usefulness in practice. We present a software tool for automatic creation of machine translation experiments and search in the space of possible confi...